SGI Developer Toolbox 6.1

Text File | 1994-08-02 | 5.8 KB | 136 lines

THIS CODE RUNS/COMPILES ON 5.1 Irix OR LATER
THIS CODE WILL NOT RUN ON LESS THAN 5.1 Irix

THIS DEMO IS BUILT WITH PRE-RELEASED DIGITAL MEDIA LIBRARY CODE.
THE API AND FUNCTIONALITY ARE SUBJECT TO CHANGE.
THE FINAL RELEASED VERSIONS OF THE DIGITAL MEDIA LIBRARIES WILL BE
AVAILABLE AT THE END OF THIS YEAR BY ORDERING THE DIGITAL MEDIA LIBRARY
DEVELOPMENT OPTION, "SC4-DEMDEV-1.2".

THE SPEECH RECOGNITION DEVELOPER TOOLKIT IS AVAILABLE FROM:

    SCOTT INSTRUMENTS CORP.
    1111 WILLOW SPRINGS DRIVE
    DENTON, TX 76205
    TEL (817) 387-9514
    FAX (817) 566-3174
______________________________________________________________________________

~4Dgifts/toolbox/src/exampleCode/speech/lackey README

mags 04.04.94

lackey

This is a speech recognition application example. It recognizes speech
by using a speech recognition library. The example uses speech to
launch desktop applications. Lackey has an internal list of "words"
(words to be recognized) and the corresponding applications to be
launched. Saying one of these words causes the corresponding
application to be executed. The commands currently recognized by lackey
are clock, shell, apanel, and lackey; this set can easily be extended.

An audio-capable system (Indigo, Indigo2, Indy) and a microphone are
required.

This example program was developed using SGI's speech recognition
software. It serves as an introduction to the speech C++ API and to
the architecture of the speech software system.

INSTALLATION:

Before you can begin using this application you must update your
system software to include the speech components. Included on this
edition of the Developer Toolbox are the inst modules for the speech
execution and developer subsets. When installed, they update your
system with the speech DSO for your Xserver, the speech templates for
existing applications (such as Showcase), sounds and images, the speech
client library, include files, and sample programs.

The speech server is part of the Xserver.
Therefore, to activate speech recognition, restart your Xserver by
logging out of your current session. After you have logged back in,
start the Speech Recognition panel:

    % srpanel &

STARTING lackey:

Be sure to set your apanel settings as follows: input sampling rate
8 kHz, input device microphone, and input level 10. Now you can start
lackey:

    % lackey &

TRAINING THE WORDS:

None of the words that lackey reacts to are known to the speech system
yet, so you must train the recognizer to recognize them. Because this
is a speaker-independent system, the more people who train the words,
the better the recognizer becomes at handling the variation among the
speakers who use the system.

Since you have never trained the words to be recognized by the lackey
program, the speech recognition panel will prompt you to train each of
these new words, one at a time. This is indicated by a picture of a cat
in the image window of srpanel. Make sure the microphone is at least
12 inches away from you. Slowly repeat the word (displayed in the
prompt window of srpanel) in a normal tone until the word is recognized
(this takes a minimum of 4 samples). You will see (1/4) near the word
being trained, which means that one of the four valid samples has been
collected. Keep repeating the word until all four samples have been
collected. Repeat this process for the rest of the words in lackey's
vocabulary.

If this process somehow aborts or fails, you can use the Customization
panel found in srpanel's Recognizer pulldown menu.
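
The word list and the four-sample training rule described above can be
pictured as a small table. The following is only a minimal sketch in
standard C++; it does not use the SGI speech API, and every name in it
(VocabEntry, make_vocabulary, add_sample, progress, command_for) is
hypothetical. The command strings attached to each word are placeholder
guesses, since this README does not say which program each word starts.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical sketch; none of these names come from the SGI speech API.

// The README says training a word takes at least four valid samples.
const int kSamplesNeeded = 4;

struct VocabEntry {
    std::string command;  // program to launch (placeholder guess)
    int samples;          // valid training samples collected so far
    VocabEntry(const std::string& cmd = "", int n = 0)
        : command(cmd), samples(n) {}
};

typedef std::map<std::string, VocabEntry> Vocabulary;

// Build the four-word vocabulary named in the README.
// The command strings are assumptions, not taken from lackey's source.
Vocabulary make_vocabulary() {
    Vocabulary v;
    v["clock"]  = VocabEntry("xclock");
    v["shell"]  = VocabEntry("xterm");
    v["apanel"] = VocabEntry("apanel");
    v["lackey"] = VocabEntry("lackey");
    return v;
}

// Record one valid training sample; returns true once the word is trained.
// Unknown words are ignored and reported as untrained.
bool add_sample(Vocabulary& v, const std::string& word) {
    Vocabulary::iterator it = v.find(word);
    if (it == v.end()) return false;
    if (it->second.samples < kSamplesNeeded) ++it->second.samples;
    return it->second.samples >= kSamplesNeeded;
}

// Format training progress in the style the panel displays, e.g. "(1/4)".
std::string progress(const Vocabulary& v, const std::string& word) {
    Vocabulary::const_iterator it = v.find(word);
    int n = (it == v.end()) ? 0 : it->second.samples;
    std::ostringstream out;
    out << "(" << n << "/" << kSamplesNeeded << ")";
    return out.str();
}

// Return the command for a recognized word, or "" if the word is
// unknown or has not yet collected enough training samples.
std::string command_for(const Vocabulary& v, const std::string& word) {
    Vocabulary::const_iterator it = v.find(word);
    if (it == v.end() || it->second.samples < kSamplesNeeded) return "";
    return it->second.command;
}
```

A caller would keep calling add_sample for a word until it returns
true, and could then pass the result of command_for to whatever
mechanism starts the program (for example fork and exec). Extending the
vocabulary in this sketch means adding one more line to
make_vocabulary.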
Features of SGI's upcoming release of speech technology:

  - speaker-independent discrete-utterance recognition
  - quick response (less than 200 msecs)
  - medium-sized vocabularies (50 active words at a time)
  - no extra hardware required
  - server-based
      - supports multiple speech application clients
      - supports networked speech application clients
      - handles focus policies for speech application clients
      - dispatches recognition events
      - audio samples processed only once by central server
        (for computational efficiency)
  - pretrained words and phrases
  - a suite of selected applications that are "speech-aware"
      - CASE tools
      - Showcase
      - Desktop
  - a vocabulary development system that allows the user to add,
    modify, and delete the words which can be recognized
  - a control panel through which the user can set behavioral
    characteristics such as the acceptance and rejection thresholds
  - a tool to generate actions for applications that are not
    "speech-aware" (not listening for speech input)
  - documentation

SGI's developer product for speech applications also features:

  - an API implemented with C and C++ speech headers & libraries
  - advanced vocabulary development tools
  - a database of pretrained words and phrases
  - documentation
      - programmers' guide
      - API specification
      - policy (style) guide for developing speech application
        behavior and vocabularies
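
The acceptance and rejection thresholds mentioned in the feature list
can be illustrated with a short sketch. This README does not describe
how the recognizer scores an utterance, so the code below assumes a
single match score where higher means a closer match, and assumes that
the rejection threshold is no greater than the acceptance threshold.
The names Decision and classify are hypothetical and are not part of
the speech API.

```cpp
// Hypothetical sketch; the real recognizer's scoring is not documented
// in this README.
enum Decision { DECISION_ACCEPT, DECISION_REJECT, DECISION_UNSURE };

// Classify one utterance by its match score.
// Assumes higher scores mean closer matches, and that
// reject_threshold <= accept_threshold.
Decision classify(double score,
                  double accept_threshold,
                  double reject_threshold) {
    if (score >= accept_threshold) return DECISION_ACCEPT;
    if (score < reject_threshold) return DECISION_REJECT;
    return DECISION_UNSURE;  // between the two thresholds
}
```

Under these assumptions, raising the acceptance threshold makes the
system less likely to act on a misheard word, and raising the rejection
threshold makes it discard more borderline utterances outright.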